Note: This page's design and presentation have been enhanced using Claude (Anthropic's AI assistant) to improve visual quality and educational experience.
Week 3 • Sub-Lesson 4

🌱 Sustainable AI: Practices & Possibilities

What researchers can do, how to measure your footprint, and the other side of the story

What We'll Cover

The previous two sessions established that AI has a real and growing environmental footprint, and that the relationship between efficiency and total consumption is complicated. This session asks: given all of that, what can you actually do?

We will look at concrete tools for measuring the carbon footprint of your own computational work, practical choices that can meaningfully reduce your impact, and — importantly — the other side of the story: the ways AI is being applied to environmental challenges. The goal is not to make you feel either guilty or reassured, but to help you make informed decisions and engage critically with claims about AI and the environment from any direction.

📏 Measuring Your Own Footprint

You cannot reduce what you do not measure. A small ecosystem of open tools has emerged to help researchers estimate and track the carbon footprint of their computational work.

🌱 CodeCarbon

The most widely used tool for tracking emissions from machine learning workloads. CodeCarbon is a Python library that monitors your GPU/CPU power draw during a computation and looks up the carbon intensity of the local electricity grid, producing an estimate in kg CO₂ equivalent.

  • Integration: Add a single decorator to your Python code — it runs in the background and produces a report when your experiment finishes
  • Grid awareness: Automatically adjusts for the carbon intensity of your local grid (or your cloud provider's region)
  • Use case: Comparing experiments — e.g. "did this architectural change reduce my compute footprint by enough to justify the performance gain?"
  • Limitation: Measures direct electricity use; does not account for embodied hardware carbon or data centre overhead beyond PUE estimates

🌍 Green Algorithms

A web-based calculator designed for the broader scientific computing community — not just ML researchers. If you run bioinformatics pipelines, simulations, or any compute-intensive work, Green Algorithms can estimate its footprint.

  • Inputs: Hardware type, number of cores, runtime, memory, location
  • Output: kg CO₂e, plus comparisons to everyday activities to aid communication
  • Peer-reviewed methodology: The underlying paper (Lannelongue et al.) was published in Advanced Science (2021), meaning its estimation method has been independently reviewed — not true of every calculator
  • Particularly useful for: Non-ML researchers who want to understand the footprint of their existing computational work

⚖️ ML CO₂ Impact Calculator

A simpler web-based calculator for estimating the carbon footprint of a planned ML training run, before you run it. Useful for project planning and grant applications where you want to report your expected compute footprint.

  • Inputs: Hardware, training duration, cloud provider/region
  • Output: Estimated kg CO₂e for the full training run
  • Distinction from CodeCarbon: This tool estimates in advance (before running); CodeCarbon measures in real time
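The arithmetic behind these pre-run calculators is simple enough to sketch directly. A minimal estimate, assuming nameplate GPU power draw, a fixed data-centre PUE, and an average grid carbon intensity — the default values here are illustrative, not measurements:

```python
def estimate_kg_co2e(gpu_count: int, gpu_power_kw: float, hours: float,
                     pue: float = 1.2, grid_kg_per_kwh: float = 0.4) -> float:
    """Rough pre-run emissions estimate for a planned training job.

    gpu_power_kw    : per-GPU draw in kW (e.g. 0.4 for a ~400 W accelerator)
    pue             : data-centre overhead multiplier (illustrative default)
    grid_kg_per_kwh : grid carbon intensity in kg CO2e/kWh (varies widely by region)
    """
    energy_kwh = gpu_count * gpu_power_kw * hours * pue
    return energy_kwh * grid_kg_per_kwh

# e.g. 8 GPUs at 0.4 kW each for 72 hours on a 0.4 kg/kWh grid
print(round(estimate_kg_co2e(8, 0.4, 72), 1))  # → 110.6
```

The real calculators refine each factor (measured utilisation rather than nameplate power, regional PUE, hourly grid data), but the structure of the estimate is the same.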

💡 Should You Report Your Compute Footprint in Papers?

The "Green AI" movement (Schwartz et al., 2020) has called for academic papers to routinely report the computational cost of the research they present — both in financial and environmental terms. Some venues now encourage or require this; most do not yet mandate it. Reporting your compute footprint in your methods section:

  • Enables reproducibility assessment (others can judge whether they have the resources to replicate your work)
  • Contributes to the community's understanding of aggregate research compute costs
  • Models good practice for your field

Even a rough estimate — "training was performed on X GPUs for Y hours, estimated emissions Z kg CO₂e using CodeCarbon" — is more useful than nothing.

🔧 Practical Choices That Make a Difference

You may not be able to influence how AI companies power their data centres, but you can make choices that meaningfully affect the footprint of your own AI use.

📊 Impact of Different Choices

| Choice | Potential carbon reduction | Effort required |
| --- | --- | --- |
| Run cloud workloads in a low-carbon region | Up to 80% (Dodge et al., 2022) | Low — change a config parameter |
| Schedule heavy compute during low-carbon grid hours | 10–40%, depending on grid variability | Low–medium — requires grid carbon awareness |
| Use a smaller, fit-for-purpose model | Significant — scales with model size difference | Low — requires benchmarking smaller models first |
| Use API calls rather than self-hosted large models | Variable — hyperscaler efficiency may offset model size | Low |
| Use efficient fine-tuning (LoRA) rather than full fine-tuning | Significant for training; minimal for inference | Low–medium |
| Avoid video generation unless necessary | ~1,000× per task replaced with text | Low — a usage decision |
| Use extended thinking / reasoning modes only when warranted | 10–50× per query (hidden "scratchpad" tokens before responding) | Low — a usage decision |
| Avoid Deep Research mode for straightforward questions | Equivalent to 100–1,000 standard queries (20–100+ web searches + synthesis) | Low — a usage decision |

🌍 The Cloud Region Choice: The Most Impactful Lever

Research by Dodge et al. (2022, Allen Institute for AI, ACM FAccT) found that running identical AI workloads in a low-carbon electricity region vs. a high-carbon region can reduce carbon emissions by up to 80%, with zero change to the model, code, or performance.

Major cloud providers (AWS, Google Cloud, Azure) all allow you to specify the region where your compute runs. Choosing us-west-2 (Oregon, with significant hydroelectric power) over us-east-1 (Virginia, more coal/gas dependent) can make a substantial difference. The same applies to European regions: eu-north-1 (Sweden) is among the cleanest; eu-central-1 (Frankfurt) less so.

Tools like the Electricity Maps API and Google Cloud's Carbon Footprint tool can help you choose the lowest-carbon option in real time.
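In practice the region choice can be automated. A sketch that picks the lowest-carbon region allowed for a workload — the intensity figures below are illustrative placeholders, not live data; a real implementation would query a service such as the Electricity Maps API instead of a hard-coded table:

```python
# Illustrative average grid intensities in g CO2e/kWh -- placeholder values,
# not live data; a real system would fetch these from a carbon-intensity API.
REGION_INTENSITY = {
    "us-east-1": 380,     # Virginia
    "us-west-2": 120,     # Oregon (hydro-heavy)
    "eu-central-1": 330,  # Frankfurt
    "eu-north-1": 40,     # Sweden
}

def greenest_region(allowed: list[str]) -> str:
    """Return the lowest-carbon region among those permitted for the workload."""
    return min(allowed, key=REGION_INTENSITY.__getitem__)

print(greenest_region(["us-east-1", "us-west-2"]))  # → us-west-2
```

Constraints such as data residency or latency can be expressed simply by restricting the `allowed` list before the selection is made.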

Choosing the Right Model Size

One of the most consequential and under-appreciated choices: do you need a frontier model, or will a smaller one do?

  • For many research tasks — data extraction, classification, summarisation, code generation — a 7B or 13B parameter model performs nearly as well as a 70B+ model
  • Smaller models are dramatically cheaper to run, both financially and energetically
  • A useful habit: always benchmark the smallest model that could plausibly work before defaulting to the largest available
  • Open-weight models (LLaMA 3, Mistral, Qwen) allow local deployment that avoids sending data to cloud providers — which may matter for sensitive research data as well as energy reasons
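The "smallest model that could plausibly work" habit can be made systematic. A sketch of an escalation ladder — the model names, scores, and quality check here are hypothetical stand-ins for your own evaluation set, not real benchmark results:

```python
def pick_smallest_adequate(models, evaluate, threshold=0.9):
    """Try models smallest-first; return the first that meets the quality bar.

    models    : list of (name, handle) pairs ordered by increasing size/cost
    evaluate  : scores a model's outputs on a held-out sample, 0..1
    """
    for name, handle in models:
        if evaluate(handle) >= threshold:
            return name
    return models[-1][0]  # fall back to the largest if nothing passes

# Hypothetical scores on a benchmark sample -- stand-ins, not real results
ladder = [("model-7b", "a"), ("model-13b", "b"), ("model-70b", "c")]
scores = {"a": 0.82, "b": 0.93, "c": 0.97}
print(pick_smallest_adequate(ladder, scores.get))  # → model-13b
```

The key design choice is ordering by cost and stopping early: the largest model is only evaluated when every cheaper option has failed the quality check.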

Prompting vs. Compute

Before investing in training or fine-tuning, it is worth asking whether better prompting could achieve the same result at a fraction of the compute cost.

  • Well-crafted prompts can match or exceed the performance of a fine-tuned smaller model for many tasks
  • Iterating on prompts uses inference compute (relatively cheap); training uses training compute (more expensive)
  • If a task truly requires fine-tuning, use parameter-efficient methods (LoRA) rather than full fine-tuning — the compute savings are substantial
  • Cache and reuse results: if you are running the same analysis on many documents, caching intermediate outputs avoids redundant compute
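The caching point is straightforward to implement. A minimal disk cache keyed on a hash of the document and the prompt, so re-running a pipeline skips work already done — `analyse` is a hypothetical stand-in for your actual model call, not a real API:

```python
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path("analysis_cache")

def cached_analyse(document: str, prompt: str, analyse) -> str:
    """Return a cached result if this (document, prompt) pair was seen before;
    otherwise run the expensive call once and store the output on disk."""
    CACHE_DIR.mkdir(exist_ok=True)
    key = hashlib.sha256((prompt + "\x00" + document).encode()).hexdigest()
    path = CACHE_DIR / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text())["result"]
    result = analyse(document, prompt)  # the expensive model call
    path.write_text(json.dumps({"result": result}))
    return result
```

Re-running the pipeline after adding a few new documents then pays compute only for the additions; including the prompt in the cache key ensures a changed prompt correctly invalidates old results.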

Context Window and Iteration Costs

Two less-visible sources of compute cost deserve attention: the amount of text you feed into a model, and how many times you iterate.

  • Context length scales compute: Feeding a 100-page document into a frontier model incurs compute proportional to that length — every token in context is processed on every query
  • The iteration trap: Sending a task to a cheap model, finding the result inadequate, and repeating 8–10 times may cost more than one well-structured prompt to a larger model
  • Front-load prompt quality: Investing time in a precise, well-specified prompt typically produces better results with lower total compute than cheap-model iteration
  • Agent workflows compound costs: Multi-step agent systems where AI calls tools, reads results, and generates new queries can multiply energy costs in ways that are not obvious from the interface
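The iteration trap is just arithmetic. A sketch comparing total token spend for repeated cheap-model attempts against one careful large-model prompt — the per-call figures are illustrative, and "cost" here is a relative per-token weight standing in for energy or price:

```python
def total_cost(calls: int, tokens_per_call: int, weight: float) -> float:
    """Relative compute spend: number of calls x tokens per call x per-token weight."""
    return calls * tokens_per_call * weight

# Illustrative: small model weighted 1.0 per token, large model 10x per token
iterated_small = total_cost(calls=9, tokens_per_call=2000, weight=1.0)   # 18000
one_shot_large = total_cost(calls=1, tokens_per_call=1500, weight=10.0)  # 15000
print(iterated_small > one_shot_large)  # → True
```

With these (hypothetical) weights, nine rounds of cheap-model trial and error already exceed one well-specified large-model call — and that is before counting the researcher time spent iterating.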

🔬 The Hidden Energy Cost of Frontier Model Features

Many advanced AI features are designed to improve output quality — but each carries an energy cost that is largely invisible in the user interface. The figures below are estimates based on available published evidence (noted in parentheses); some platform costs are not publicly disclosed.

| Feature | What it does | Approximate token / compute overhead |
| --- | --- | --- |
| Standard prompt | Single forward pass; model reads prompt, generates response | 1× (baseline) |
| Chain-of-thought prompting | Model reasons step-by-step in the visible response, generating significantly more output tokens | ~3–8× more tokens (Wharton GAIL, 2024: CoT requests take 35–600% longer than direct requests) |
| Extended thinking / reasoning models (o1, o3, Claude with thinking enabled) | Generates thousands of hidden "scratchpad" tokens before producing a visible response; users see only the final answer | 10–20× more tokens; 50–130× higher API cost (Epoch AI, 2024; OpenAI pricing) |
| Deep Research mode (Gemini, Perplexity; ChatGPT costs not published) | Makes 80–160+ web searches, reads full pages, synthesises across sources; can run for 5–30 minutes | 250k–900k input tokens per query; ~$3/run (Google Gemini API docs; Perplexity pricing docs) |
| Multi-step agent workflows | Each tool call, web search, or sub-task triggers additional model calls; costs compound with workflow length | 10–50× more tokens than a single-shot query (arXiv 2506.04301, 2025) |
| Large context (long documents) | Transformer attention scales quadratically with sequence length; doubling context roughly quadruples attention compute | O(n²) scaling — a proven property of standard attention, not an estimate (Duman Keles et al., ALT 2023) |
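The quadratic scaling in the last row can be checked on paper: standard attention compares every token with every other, so the score matrix alone costs on the order of n² operations. A toy FLOP count for that matrix (ignoring the linear-in-n projection costs):

```python
def attention_score_flops(n: int, d: int) -> int:
    """Multiply-adds to form the n x n score matrix QK^T with head dimension d."""
    return n * n * d

# Doubling the context length quadruples the score-matrix cost
print(attention_score_flops(2048, 64) / attention_score_flops(1024, 64))  # → 4.0
```

This is why a 100-page document in context is disproportionately expensive: the attention component grows with the square of the input length, not linearly.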

💡 When Are These Features Worth It?

Extended thinking and Deep Research exist because they genuinely improve results on hard problems. A Deep Research session synthesising 50 sources for a complex literature question may save many hours of manual work — and may well be the lower-energy option when your time is factored in.

The concern is reflexive use: enabling reasoning modes or triggering Deep Research for questions that a standard prompt would handle adequately. As with model size, the principle is the same: match capability to complexity, and be intentional rather than defaulting to the most powerful available option.

🌿 The Other Side: AI for Environmental Solutions

A responsible treatment of AI's environmental impact requires engaging with both sides. AI is not only a consumer of energy — it is also being actively applied to some of the world's most pressing environmental challenges. These applications do not cancel out AI's footprint, but they are part of the full picture.

📄 Foundational Reading: Tackling Climate Change with Machine Learning

Rolnick et al. (2022): "Tackling Climate Change with Machine Learning" — ACM Computing Surveys. A landmark survey paper by 22 researchers outlining 13 areas where ML can meaningfully contribute to climate mitigation and adaptation. Essential reading for a balanced view of AI's relationship with the environment.

Energy Systems

  • Grid optimisation: ML can predict demand and balance supply more efficiently, enabling higher penetrations of variable renewable energy
  • DeepMind + Google: A reinforcement learning system reduced cooling energy use in Google's data centres by ~40% — a real example of AI reducing its own infrastructure's footprint
  • Smart charging: AI-coordinated EV charging can avoid peak grid demand and shift load to renewable-rich periods
  • Demand forecasting: Better predictions of energy demand reduce the need for fossil fuel peaker plants

Climate Science and Modelling

  • Weather prediction: Google DeepMind's GraphCast and NVIDIA's FourCastNet produce weather forecasts faster and at lower compute cost than traditional numerical models
  • Climate downscaling: ML emulators can produce high-resolution regional climate projections from coarser global models at a fraction of the compute cost
  • Extreme event prediction: Improved forecasting of floods, droughts, and wildfires with longer lead times
  • Carbon monitoring: Satellite imagery analysis for tracking deforestation, methane leaks, and land use change

Science and Materials

  • Materials discovery: AlphaFold's success with protein structure has inspired similar approaches for discovering new materials for batteries, solar cells, and carbon capture
  • Drug discovery: AI-accelerated drug development has the potential to address diseases exacerbated by climate change
  • Agricultural optimisation: Precision agriculture AI can reduce fertiliser use (and associated N₂O emissions) while maintaining yields
  • Carbon capture: ML can help identify and optimise direct air capture materials and processes

⚖️ How to Think About the Trade-off

The existence of beneficial AI applications for climate does not automatically justify AI's overall energy footprint — that would require demonstrating that the net impact is positive, which requires careful case-by-case analysis. Some questions worth asking:

  • Does this specific AI application actually require a frontier model, or could a much smaller system do the job?
  • What is the counterfactual? Would the same outcome be achieved without AI, using less energy?
  • Are the emissions savings from the application larger than the emissions from running the AI system itself?
  • Who bears the environmental costs (global electricity + hardware) vs. who receives the benefits (specific sectors or regions)?

These are genuinely difficult questions, and the honest answer in most cases is that we don't yet have the data to answer them precisely.

🏛️ The Policy Landscape

Individual choices by researchers matter, but the scale of AI's environmental impact requires systemic responses. Here is where policy is currently heading.

Disclosure Requirements

  • EU AI Act: Includes provisions for transparency about compute resources and energy use for certain high-impact AI systems
  • SEC climate disclosures (US): Require large publicly-listed companies to disclose material climate risks, which can include data centre energy use
  • Voluntary frameworks: The Partnership on AI has proposed a framework for AI environmental impact disclosure — currently voluntary, not mandated
  • The gap: Most disclosure regimes apply to companies, not to AI products or models specifically

What Researchers Can Advocate For

  • Mandatory model cards: Reporting of compute, energy use, and carbon footprint alongside model releases — similar to nutrition labels
  • Journal and conference standards: Requiring compute reporting in publications, as some ML venues are beginning to implement
  • Procurement standards: Universities and research funders setting green criteria for AI services they purchase
  • Third-party verification: Supporting independent auditing of AI company environmental claims rather than relying on self-reporting

🌍 Africa and the AU: A Significant Policy Gap

The African Union Continental AI Strategy (adopted July 2024) is the continent's primary AI governance framework. Its environmental provisions are minimal but worth noting:

  • Water scarcity acknowledged: The strategy explicitly states that data centre cooling "poses a threat to regions already facing water scarcity" — directly relevant to an African context
  • High-risk AI framework: "Environmental impact" is listed as a risk dimension for high-risk AI systems, in principle requiring impact assessments
  • General principle: AI development should not harm "the environment" — stated without specific implementation mechanisms
  • AU Data Policy Framework (2022): The AU's data governance framework contains no meaningful environmental provisions

A 2025 analysis by TechPolicy.Press examining 14 African AI strategies found that only three acknowledged environmental consequences at all, and that the AU's single mention of water scarcity "provides no concrete solutions or mitigation plans." This gap is significant: Africa is among the world's most climate-vulnerable regions, and countries like South Africa run electricity grids that are among the world's most carbon-intensive (historically over 80% coal). AI workloads and data centres in the region inherit that carbon intensity — yet current continental AI governance does not address this.

UNESCO and the World Bank have been working with AU policymakers on "Green AI" pathways, and the policy gap may narrow as the Continental Strategy moves through implementation (Phase 1: 2025–2026; Phase 2: 2028–2030). For now, it represents an area where researchers — particularly those based in Africa — can meaningfully contribute to policy advocacy.

📄 Policy and Measurement Resources

OECD (2022): "Measuring the Environmental Impacts of Artificial Intelligence Compute and Applications" — policy-focused analysis of methodological challenges and governance options.

Schwartz et al. (2020): "Green AI" (Communications of the ACM) — the foundational paper calling for the research community to report efficiency metrics alongside performance. Short and accessible.

Dodge et al. (2022): "Quantifying the Carbon Intensity of AI in Cloud Instances" — the paper demonstrating the 80% emissions reduction available from cloud region choice.

🧭 A Framework for Your Own Decisions

Rather than a set of rules, here is a set of questions you can ask yourself when deciding how to use AI in your research.

Questions to Ask Before Running Compute-Intensive AI Work

  1. Do I actually need AI for this task? Could a simpler statistical method, a keyword search, or manual analysis achieve the same result with less compute?
  2. Do I need a frontier model — and do I need its most powerful features? Would a 7B parameter open-weight model do the job? If using a frontier model, do I need extended thinking, Deep Research, or agent workflows — or will a standard prompt suffice? Have I tested simpler approaches first?
  3. Can I measure the footprint? Can I add CodeCarbon to this experiment, or use the Green Algorithms calculator to estimate it?
  4. Where is my compute running? If using cloud, am I in the lowest-carbon region available for this task?
  5. Can I time this work? For non-urgent batch work, can I run it at low-demand grid hours or when renewable generation is high?
  6. Am I caching or repeating? Am I running the same analysis multiple times unnecessarily?
  7. Will I report this? If this is publishable research, will I include the compute footprint in the methods section?

💡 Proportionality: Don't Catastrophise, Don't Dismiss

Two failure modes are common in discussions of AI and the environment. The first is catastrophising: treating every ChatGPT query as an environmental emergency. The second is dismissing: pointing to AI's small per-query footprint as a reason to ignore the issue entirely.

The proportionate response is to:

  • Recognise that aggregate scale is what matters, not individual queries in isolation
  • Make the low-effort choices (model size, cloud region, caching) as a matter of routine
  • Apply more scrutiny to high-compute work (training runs, large-scale inference pipelines)
  • Engage critically with both industry claims and media alarmism
  • Support systemic changes (disclosure requirements, research community norms) that address the problem at scale

📚 Week 3 Summary: Environmental Implications of AI

Across the four sessions this week, we have built a picture of AI's environmental footprint that is neither comforting nor catastrophic, but honest about what we know and what we don't:

  • The data are uncertain: Corporate opacity means all figures are estimates; treat them as order-of-magnitude guides
  • Critical minerals underpin all AI hardware: Silicon, cobalt, rare earth elements, and lithium are geographically concentrated, ethically contested, and environmentally costly to extract — a supply chain dimension often absent from mainstream AI debate
  • Text vs. video is not comparable: Video generation uses ~1,000× more energy than text; these are categorically different use cases
  • Location is the biggest lever: Where your compute runs determines up to 80% of its carbon footprint
  • Embodied carbon is missing from most analyses: Manufacturing hardware releases significant carbon before a single query is run
  • The Jevons paradox is real: Efficiency gains in AI have so far been outpaced by growth in deployment and model scale
  • AI can also help: Genuine applications in climate science, energy systems, and materials research — though these do not automatically offset AI's footprint
  • Researchers can act: Model choice, cloud region, reporting, and advocacy are all within your control

Next week (Week 4): We move from environmental implications to the broader ethical landscape — ethical frameworks for using AI in research, including questions of transparency, privacy, bias, and academic integrity.